CULTURESAMPO – A Collective Memory of Finnish Cultural Heritage on the Semantic Web 2.0
Abstract
This paper presents the Semantic Web 2.0 application CULTURESAMPO, an ambitious system for creating a collective semantic memory of the cultural heritage of a nation on the Semantic Web 2.0, combining ideas underlying the Semantic Web and the Web 2.0. The system addresses the semantic challenge of aggregating highly heterogeneous, cross-domain cultural heritage content into a semantically rich intelligent system for human and machine users. At the same time, CULTURESAMPO is an approach to solving the social and practical Web 2.0 challenge of organizing the underlying collaborative ontology development and content creation work of memory organizations and citizens.

1 Components of a National Semantic Memory

In our view, a cultural heritage memory on the Semantic Web 2.0 should be built on three pillars. First, we need a cross-domain content infrastructure of ontologies, metadata standards, and related services that is developed and maintained on a global level through collaborative local efforts. Second, the process of producing ontologically harmonized metadata should be organized in a collaborative fashion, where distributed content producers are able to create semantically correct annotations cost-efficiently through centralized services. Third, the contents should be made available to human end-users and machines through intelligent search, browsing, and visualization techniques in a portal. For machines, easy-to-use mash-up APIs and web services should be available. In this way, the collaboratively aggregated, semantically enriched national memory can be exposed and reused easily as services in other portals and applications in the same vein as Google Ads or Maps.

CULTURESAMPO [1], released on the public web on September 25, 2008, is an operational demonstration, on a national Finnish level, of implementing such a semantic collective memory.
The system consists of three components corresponding to the three pillars above:

1. National cross-domain content infrastructure FinnONTO. The basis of CULTURESAMPO is the national FinnONTO infrastructure [2, 3], which includes a collaboratively created system of cross-domain ontologies and related ontology services for utilizing them cost-efficiently [4]. The ontologies and the services were released in several languages on the public web as the National Ontology Service ONKI on September 12, 2008.
2. A content creation process. Our model consists of a set of metadata models and a content creation process for producing and harvesting content from museums, libraries, archives and other organizations, as well as from individual citizens and Web 2.0 sources, such as Wikipedia and Panoramio.
3. Semantic Web 2.0 portal CULTURESAMPO. The portal itself is unique in its use of versatile cross-domain semantic models, new semantic searching and browsing methods, and semantic visualizations for end-users, both humans and machines.

In the following, these three components are explained in more detail.

1 http://code.google.com/apis/maps/
2 http://www.kulttuurisampo.fi/

2 A Collaborative Ontology Infrastructure

An integral part of CULTURESAMPO is formed by the ontologies and services of the FinnONTO infrastructure [2, 3]. The general idea of the FinnONTO approach is to extend the generic, logic-based W3C recommendations with domain-specific ontologies in different domains. Most of the FinnONTO ontologies were developed by transforming nationally used thesauri into lightweight ontologies. The process was not mechanical as, e.g., in [5]; manual processing was required in order to refine the semantic thesaurus relations into full-blown subsumption hierarchies. In the FinnONTO model, the ontologies are developed in a distributed fashion by collaborating expert groups of different fields, and are mapped together to form a large national ontology called KOKO encompassing all domains.
At the moment, KOKO includes an upper ontology YSO (20 600 concepts), a museum ontology MAO (6800 concepts), an agriforestry ontology AFO (5500 concepts), an applied art ontology TAO (2600 concepts) and a photography ontology VALO (1900 concepts). The ontologies are provided to end-users not only in RDF/OWL form, as usual, but also as ready-to-use semantic web widgets [6] using Web 2.0 AJAX APIs, and through conventional Web Services [4]. The KOKO ontology as a single whole is used by developers and end-users, and is the ontological basis of CULTURESAMPO. In addition to KOKO, CULTURESAMPO also utilizes a geographical registry of 800 000 places in Finland, a spatio-temporal ontology of Finnish counties 1865–2007 [7], an ontology of persons and organizations, and ontologized international systems such as Iconclass and the Union List of Artist Names (ULAN).

3 http://www.yso.fi/
4 http://www.wikipedia.org/
5 http://www.panoramio.com/
6 http://www.iconclass.nl/
7 http://www.getty.edu/research/conducting_research/vocabularies/ulan/

3 Content Creation Process

CULTURESAMPO contains cultural objects of 26 different content types: artifacts, paintings, drawings, sculpture, abstract art, novels, comics, web pages, three types of folklore, five types of folk music, photos, aerial photos, persons, organizations, biographies, historical events, skills, videos, buildings, and archeological sites. These content types are represented using 16 different metadata schemas. As of September 26, 2008, the aggregated knowledge base contains 52 267 cultural objects and 234 597 other resources, such as ontological class concepts and place instances. The cultural objects are described by 624 021 property triples. The content is enriched using reasoning, resulting in 5 844 153 property triples. The enriched knowledge base is used for intelligent information retrieval and for creating semantic recommendation links between objects.
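To make the idea of reasoning-based enrichment concrete, the following minimal Python sketch materializes extra property triples by following a subsumption hierarchy upwards. It is only an illustration under stated assumptions: the text does not specify which inference rules CULTURESAMPO applies, and the concept names, the property name `depicts`, the identifier `object:42`, and the helpers `ancestors` and `enrich` are all hypothetical.

```python
def ancestors(concept, broader):
    """Return all transitive superclasses of `concept` (excluding itself)."""
    seen = set()
    stack = list(broader.get(concept, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(broader.get(parent, ()))
    return seen


def enrich(triples, broader):
    """Add (subject, property, superclass) for every superclass of each value.

    Assumed rule for illustration only: an object annotated with a concept
    is also retrievable through every more general concept.
    """
    enriched = set(triples)
    for subject, prop, value in triples:
        for parent in ancestors(value, broader):
            enriched.add((subject, prop, parent))
    return enriched


# Hypothetical subsumption hierarchy: concept -> set of direct superclasses.
BROADER = {
    "leather boot": {"boot"},
    "boot": {"footwear"},
    "footwear": {"clothing"},
}

# Hypothetical annotation of one cultural object.
TRIPLES = {("object:42", "depicts", "leather boot")}

ENRICHED = enrich(TRIPLES, BROADER)
for triple in sorted(ENRICHED):
    print(triple)
```

In this toy case one asserted triple grows into four, so a search for "footwear" would also find the object. Materializing such inferences before indexing is one plausible way in which an enriched knowledge base can become much larger than the asserted one.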
In addition, there are 1 194 230 pieces of data related to reference resources. The content is represented using RDF and OWL, and SPARQL is used for recommendations. The system also utilizes external web resources: all Wikipedia articles (in English and Finnish) that have coordinate information, as well as photographs from the Panoramio service, can be found on CULTURESAMPO's map views.

These information sources have diverse ownerships. The contents come from 21 museums, archives, and libraries, most of which produce their contents independently of each other using heterogeneous cataloging systems and practices, e.g., different vocabularies. Wikipedia and Panoramio content is created internationally by the public. CULTURESAMPO also has an internal commenting facility by which individuals can contribute new knowledge to individual content items, e.g., identify persons in an old photograph of a museum collection. In these ways, citizens are able to contribute to the national semantic memory. Furthermore, interactive content production based on the SAHA editor [8] is being implemented in the system; this content creation channel has already been used internally by participating organizations.

From a semantic modeling viewpoint, a research focus of our work has been the event- and process-based annotations used in artificial intelligence and knowledge representation [9]. In our case, events have been used for modeling cultural processes and narrative stories [1, 10] and for metadata schema integration [11, 12]. The KOKO ontology was designed to support this by clearly separating events and processes from the other concepts along the model of DOLCE [13]. In some metadata schemas of CULTURESAMPO it is possible to annotate content using processes in terms of events, subevents and their sequences; the model in use in the prototype is a simplified version of our earlier model [10].
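As a rough illustration of annotating a process in terms of events, subevents and their sequences, the Python sketch below represents a process as a tree of events and flattens it into a chronologically ordered outline. This is not the annotation schema of [10]; the class `Event`, the function `outline`, and the example process with its labels and time offsets are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """One step of a process; `start` is an offset in seconds (assumed unit)."""
    label: str
    start: int
    subevents: list = field(default_factory=list)


def outline(event, depth=0):
    """Flatten an event tree into (depth, start, label) rows in temporal order."""
    rows = [(depth, event.start, event.label)]
    for sub in sorted(event.subevents, key=lambda e: e.start):
        rows.extend(outline(sub, depth + 1))
    return rows


# Invented example process; subevents are deliberately given out of order.
PROCESS = Event("Making a clay pot", 0, [
    Event("Firing", 600),
    Event("Shaping", 120, [
        Event("Centering the clay", 120),
        Event("Pulling the walls", 300),
    ]),
    Event("Preparing the clay", 0),
])

for depth, start, label in outline(PROCESS):
    print(f"{'  ' * depth}{start:>4}s  {label}")
```

Each row carries a nesting depth and a start offset, which is the kind of information an interface would need in order to show a navigable outline of a process or to jump to a given point in a recording.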
The portal then automatically generates an interactive representation of the process as a kind of temporal table of contents. This system is used in the prototype for creating skill descriptions, cultural process descriptions, and documentation of processes in videos:

1. Semantic skill models. An example of a skill model is the model “Production of Ceramics” produced by experts at the University of Applied Arts in Helsinki. It illustrates and explains the composition of different work phases when manufacturing ceramics. At each phase, semantic recommendations to relevant CULTURESAMPO contents can be created. For example, links to products in collections that were manufactured using the same techniques are obtained automatically.

2. Semantic models of cultural processes. There is a similar chronological model, “A Year on a Farm”, of the seasonal events and processes taking place at a typical farm in Finland. Again, tools and other materials from CULTURESAMPO are linked automatically with related events. This model was created by a farmer employed at the Finnish Museum of Agriculture. The exhibition of this museum is actually organized using the same idea of presenting the farming events that take place during the different seasons of the year.

3. Semantic documentation on videos. The annotation model can also be applied to documenting instances of actual skill events or processes recorded on video. The case example available on the portal describes how the shoemaker Onni Wirlander manufactured a pair of traditional leather boots. The video was produced, and the actual annotation created, by the Espoo City Museum using the SAHA editor connected to the ONKI ontology services. The semantic model describes what happens in different (sub)sequences of the video. Semantic search can find not only the video as a black box, as in systems such as YouTube, but also points of interest inside the video. The video can be viewed directly starting from different points of interest.
This is important with longer videos. The Wirlander video, for example, lasts over 20 minutes. When watching the video, the recommendation system dynamically creates, for each subsequence separately, recommendation links to materials of interest in the portal, such as tools related to the sequence.

The examples above demonstrate our new idea of representing and storing immaterial, procedural cultural heritage in the memory system, here descriptions of handicraft skills and cultural processes. Typical cultural heritage portals contain metadata only about concrete objects.

Content creation in CULTURESAMPO, for both the ontologies and the metadata, is based on distributing the work to organizations and citizens in a Web 2.0 fashion. In this strategy, extra costs can be minimized by reorganizing the work already done in the organizations and by the public. The work is supported by a number of generic FinnONTO tools, such as the metadata editor SAHA, the information extraction tool POKA, and the semantic content validator VERA. The contents are published with a most liberal open source licensing policy where only mentioning the source developers is required. Several organizations that have been developing traditional thesauri are now considering starting to develop lightweight ontologies within the FinnONTO framework.

4 The Semantic Web 2.0 Portal CULTURESAMPO

The portal is an end-user application for both 1) human users and 2) machines.

Human User Interface

Human clients include the public and researchers, who can use the system not only for finding objects but also for answering questions and analyzing the results along ontological classifications. The system is multilingual (Finnish, Swedish, and English). However, because most of the ontologies and contents are available only in Finnish, the system is not equally powerful in other languages.
8 http://www.seco.tkk.fi/tools/poka/
9 http://www.seco.tkk.fi/services/vera/

In the human interface, an easy-to-use Google-like search field is always available for typing in any query. Behind it, semantic autocompletion [14] is used, semantic query expansion and reasoning are employed during search, and the underlying ontologies are used to classify and organize the results into meaningful categories. For example, when searching with a person's name, the results are categorized by the roles connecting the person and the matched objects, e.g., paintings created by her, sculptures depicting her (but created by someone else), a biography or Wikipedia page about her, etc.

A major novelty of the portal is the access to contents through nine “thematic perspectives” available at the main entry page and as choices in the menu bar:

Map views. There are four map views available using Google Maps [7]: one for viewing all objects and filtering them in terms of their relation to places; one for finding old Finnish counties with digitized borders; one for viewing historical maps layered semi-transparently over modern Google maps; and one for finding nearby objects of interest. It is also possible to find Wikipedia articles and Panoramio photos by using the map interfaces.

Relational search. This view is a demonstration of relational search [15], where the idea is not to search for objects but for associative relation chains between objects. We used the ULAN registry of 120 000 artists and organizations with 390 000 names. The user types in two names, using semantic autocompletion, and CULTURESAMPO tells how the persons or organizations are related to each other in the social network, based on some 50 different social roles (e.g., parent-of, teacher-of, patron-of, etc.). The underlying social RDF/OWL network can also be browsed with a graphical network browser.

Domain-centric faceted search.
In this view, a new, generalized version of the faceted search paradigm has been developed, called domain-centric search [16]. In this paradigm, domain ontologies and the myriad of different properties and roles related to the 16 metadata schemas in use can be used for very flexible queries and for analysis of the contents based on classifying the search results. For example, it is easy to find out what themes were most popular in the Finnish novels published in 2007, or what colors were used in different textile types during different time periods.

Collections view. Here the contents can be accessed based on an organizational view. Each participating organization has an automatically generated home page in the system with links to subcollections and the actual collection items.

Finnish history view. This view is based on an ontology representing events in Finnish history. These events are of interest in their own right, but are also used to create semantic recommendations to other CULTURESAMPO contents, e.g., to biographies of persons participating in the events.

Skills and processes. This view is used for finding cultural procedural descriptions in the system, i.e., semantic models of skills, processes, and documentation on videos. In the prototype, there is one example of each type available.

Biographies. In this view, biographies of the National Biography are used to access CULTURESAMPO contents. When reading biographies, related contents are shown based on the concepts extracted from the text using the information extraction tool POKA.

Semantic Kalevala. This view [17] contains a semantically annotated version of the national epic of Finland, Kalevala, which is related in many ways to Finnish art and culture. The epic also has interesting links to old Finnish folklore, on which it is actually based, and to folk music lyrics.
Thousands of runes and pieces of folk music are available in the portal. When reading Kalevala, annotations related to its subsequences can be viewed to help reading, and semantic recommendations to related materials in CULTURESAMPO are automatically produced.

Karelia view. This view contains Wikipedia articles about the Karelia area in Finland, which has been influential in Finnish culture. For example, lots of folklore has been collected from this area, and the Kalevala epic is strongly associated with it. As in the biographies view, the POKA system is used for extracting concepts from the texts (here web pages) and for generating related semantic recommendations for more information.

Machines as End-users

In addition to human end-users, the system can be used by machines via an AJAX interface. The idea is that the collaboratively aggregated and semantically enriched national knowledge base can be used not only by the CULTURESAMPO portal but also by other portals and systems on the web. The key idea here is to use cost-effective, ready-to-use Web 2.0 mash-up services in the same spirit as Google Ads or Maps are used on external web pages and applications. In this way, museums, libraries, tourism portals, newspapers, individual citizens, and other users can include CULTURESAMPO materials, such as semantic search results and recommendation links, on their web pages using mash-ups. This is a clear win-win situation for everybody: the materials of the CULTURESAMPO collaborative network get more visibility, and the external users can enrich their own materials for “free”; only 1–2 lines of JavaScript code are needed. We envision that in the near future applications of CULTURESAMPO through its AJAX interface will be available not only in other cultural heritage portals, but in commercial systems such as tourism and newspaper portals, too. In our own work, GPS-based mobile and navigation services are being developed, e.g., as a part of the SmartMuseum EU project.
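The relational search perspective described in this section looks for chains of relations connecting two named persons or organizations. The Python sketch below shows one simple way such a chain could be found, using breadth-first search over role-labelled edges. It is an assumption-laden illustration, not the method of [15]: the function `relation_chain`, the treatment of every relation as traversable in both directions, and the four invented actors are all hypothetical, and only the role names parent-of, teacher-of and patron-of come from the text.

```python
from collections import deque


def relation_chain(edges, source, target):
    """Return a shortest chain of (from, role, to) steps, or None if unconnected."""
    graph = {}
    for subj, role, obj in edges:
        graph.setdefault(subj, []).append((role, obj))
        # Assumption: relations may also be followed backwards.
        graph.setdefault(obj, []).append((f"inverse of {role}", subj))

    came_from = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for role, neighbour in graph.get(node, []):
            if neighbour not in came_from:
                came_from[neighbour] = (node, role)
                queue.append(neighbour)

    if target not in came_from:
        return None

    # Walk back from the target to the source to rebuild the chain.
    chain = []
    node = target
    while came_from[node] is not None:
        previous, role = came_from[node]
        chain.append((previous, role, node))
        node = previous
    chain.reverse()
    return chain


# Invented social network using role names mentioned in the text.
EDGES = [
    ("Artist A", "teacher-of", "Artist B"),
    ("Artist C", "parent-of", "Artist B"),
    ("Patron D", "patron-of", "Artist C"),
]

CHAIN = relation_chain(EDGES, "Artist A", "Patron D")
for step in CHAIN:
    print(" ".join(step))
```

Breadth-first search returns a chain with the fewest steps, which is a natural default when explaining to a user how two actors are connected. A production system over a registry the size of ULAN would likely need indexing or precomputation beyond this sketch.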
5 Discussion and Conclusions

The vision and implementation of CULTURESAMPO go beyond current semantic web portals for cultural heritage [18], such as those awarded earlier at the Semantic Web Challenge competition (MuseumFinland 2004 [19], MultimediaN 2006 [20], and CHIP Demonstrator 2007 [21]). The novelty of the CULTURESAMPO system has many facets: 1) it is highly cross-domain with 26 content types and 16 metadata schemas (usually only one schema such as Dublin Core or VRA is used), 2) it makes use of sophisticated semantic annotation models including events and processes, 3) it uses new kinds of semantic search and recommendation techniques, 4) it has an exceptionally versatile selection of semantic visualizations available (different map views, timelines, graphs, process visualization, semantic video viewing), 5) it is based on a large nationwide, collaboratively maintained infrastructure of ontologies and ontology services, 6) it includes a model of and tools for collaborative semantic content creation, and 7) the services are available for machines, too.

10 http://www.smartmuseum.eu/

Two user evaluation studies concerning the semantic recommendations of an earlier version of CULTURESAMPO have been performed with some promising results [11]. Formal evaluation of the now finished prototype with its various new features can now be started. The portal is scalable in terms of the Web 2.0 content creation model, different types of content, computational complexity, and the number of simultaneous users. The implementation is based on conventional search engine technology (Apache Lucene) that does semantic search using semantic indexing.

Acknowledgements

In total 36 researchers, including Airi Hortling, Jouni Hyvönen, Ellen Karhulampi, Suvi Kettula, Helena Mäkimattila, and Jari Väätäinen, have contributed to developing CULTURESAMPO contents and the related infrastructural components, excluding most local work at memory organizations.
This work is a part of the national FinnONTO research project 2003–2007, 2008–2010, funded mainly by Tekes and a consortium of 38 companies and public organizations. The work is also partly funded by the EU FP7 SmartMuseum project 2008–2010, and the Finnish Cultural Foundation (2008–2010).
Publication date: 2008